1,986 research outputs found

    Guided Stereo Matching

    Stereo is a prominent technique to infer dense depth maps from images, and deep learning further pushed forward the state-of-the-art, making end-to-end architectures unrivaled when enough data is available for training. However, deep networks suffer from significant drops in accuracy when dealing with new environments. Therefore, in this paper, we introduce Guided Stereo Matching, a novel paradigm that mitigates this weakness by leveraging a small amount of sparse, yet reliable depth measurements retrieved from an external source. The additional sparse cues required by our method can be obtained with any strategy (e.g., a LiDAR) and used to enhance features linked to corresponding disparity hypotheses. Our formulation is general and fully differentiable, so the additional sparse inputs can be exploited in pre-trained deep stereo networks as well as for training a new instance from scratch. Extensive experiments on three standard datasets and two state-of-the-art deep architectures show that even with a small set of sparse input cues, i) the proposed paradigm enables significant improvements to pre-trained networks. Moreover, ii) training from scratch notably increases accuracy and robustness to domain shifts. Finally, iii) it is suited and effective even with traditional stereo algorithms such as SGM. Comment: CVPR 201
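
As an illustration of how a sparse depth cue could enhance the features tied to one disparity hypothesis, here is a minimal single-pixel sketch. The abstract does not give the formula, so the Gaussian weighting and the names `modulate_with_hint`, `k`, and `c` are assumptions made for illustration, not the authors' implementation.

```python
import math


def modulate_with_hint(scores, hint, valid, k=10.0, c=1.0):
    """Re-weight per-disparity matching scores for one pixel.

    scores[d] is the matching score (higher is better) for disparity d.
    hint is the disparity suggested by the external sensor (e.g. a LiDAR).
    valid says whether this pixel has a hint at all.
    k (peak gain) and c (Gaussian width) are hypothetical parameters.
    """
    if not valid:
        # No sparse cue at this pixel: leave the scores untouched.
        return list(scores)
    # Gaussian centred on the hinted disparity: hypotheses near the hint
    # are amplified, hypotheses far from it are damped.
    return [
        s * k * math.exp(-((d - hint) ** 2) / (2.0 * c ** 2))
        for d, s in enumerate(scores)
    ]


if __name__ == "__main__":
    flat = [1.0, 1.0, 1.0, 1.0, 1.0]  # ambiguous pixel: all hypotheses tie
    guided = modulate_with_hint(flat, hint=3, valid=True)
    print(guided.index(max(guided)))  # the hinted disparity now wins
```

Because the weighting is a smooth function of the scores, a step like this stays differentiable, which is consistent with the claim that the cues can be used both with pre-trained networks and when training from scratch.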

    Real-time self-adaptive deep stereo

    Deep convolutional neural networks trained end-to-end are the state-of-the-art methods to regress dense disparity maps from stereo pairs. These models, however, suffer from a notable decrease in accuracy when exposed to scenarios significantly different from the training set (e.g., real vs. synthetic images). We argue that it is extremely unlikely to gather enough samples to achieve effective training/tuning in any target domain, thus making this setup impractical for many applications. Instead, we propose to perform unsupervised and continuous online adaptation of a deep stereo network, which allows for preserving its accuracy in any environment. However, this strategy is extremely computationally demanding and thus prevents real-time inference. We address this issue by introducing a new lightweight, yet effective, deep stereo architecture, Modularly ADaptive Network (MADNet), and developing a Modular ADaptation (MAD) algorithm, which independently trains sub-portions of the network. By deploying MADNet together with MAD we introduce the first real-time self-adaptive deep stereo system enabling competitive performance on heterogeneous datasets. Comment: Accepted at CVPR2019 as oral presentation. Code available: https://github.com/CVLAB-Unibo/Real-time-self-adaptive-deep-stere

    Learning monocular depth estimation with unsupervised trinocular assumptions

    Obtaining accurate depth measurements out of a single image represents a fascinating solution to 3D sensing. CNNs led to considerable improvements in this field, and recent trends replaced the need for ground-truth labels with geometry-guided image reconstruction signals enabling unsupervised training. Currently, for this purpose, state-of-the-art techniques rely on images acquired with a binocular stereo rig to predict inverse depth (i.e., disparity) according to the aforementioned supervision principle. However, these methods suffer from well-known problems inherited from the stereo setup, such as artifacts near occlusions and at the left image border. Therefore, in this paper, we tackle these issues by moving to a trinocular domain for training. Assuming the central image as the reference, we train a CNN to infer disparity representations pairing such image with frames on its left and right side. This strategy allows obtaining depth maps not affected by typical stereo artifacts. Moreover, since trinocular datasets are seldom available, we introduce a novel interleaved training procedure that enforces the outlined trinocular assumption using current binocular datasets. Exhaustive experimental results on the KITTI dataset confirm that our proposal outperforms state-of-the-art methods for unsupervised monocular depth estimation trained on binocular stereo pairs as well as any known methods relying on other cues. Comment: 14 pages, 7 figures, 4 tables. Accepted to 3DV 201
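
The abstract treats disparity as inverse depth. For a rectified stereo pair this is the standard relation depth = focal length × baseline / disparity, sketched below. The function name and the example camera values are hypothetical and are not taken from the paper or from KITTI.

```python
import math


def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Convert a disparity (pixels) to metric depth for a rectified stereo rig.

    focal_px is the focal length in pixels, baseline_m the distance between
    the two camera centres in metres. Both are illustrative inputs.
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity.
        return math.inf
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Made-up rig: 700 px focal length, 0.5 m baseline.
    print(disparity_to_depth(50.0, focal_px=700.0, baseline_m=0.5))  # 7.0
```

The inverse relation explains why small disparity errors matter most for distant points: the same one-pixel error changes depth far more when the disparity is small.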

    Urban Geology for the Enhancement of the Hypogean Geosites: the Perugia Underground (Central Italy)

    Urban geology analyses natural risks and promotes geoheritage in urban areas. In cities characterized by a high cultural value, the hypogean artificial cavities often present in the downtown area offer a unique opportunity to show the geological substratum. Moreover, these places could be a point of interest in urban trekking with the abiotic component of the landscape as a topic (geotourism). To investigate these areas, rigorous bibliographic research and a geomorphological assessment are the first steps, but non-invasive methods are also increasingly in demand. In this paper, we present a multidisciplinary study on the Etruscan Well (third century B.C.), one of the most important Etruscan artefacts in Perugia (Umbria region, Central Italy). The characteristics of the sedimentary deposits outcropping along the perimeter walls have been collected. Moreover, to show the underground geoheritage, we provide a 3D model of the well and the surrounding area integrating a georeferenced laser scanner survey with ground-penetrating radar prospecting. We aim to obtain a three-dimensional mapping of accessible internal rooms to depict the geological characteristics of the Etruscan Well, also revealing a surrounding network of buried galleries. The results are not only a meaningful advancement in the archaeological, geological and historical knowledge of downtown Perugia but also a starting point for geoheritage promotion and dissemination, providing images and 3D reconstructions of underground areas.

    GO-SLAM: Global Optimization for Consistent 3D Instant Reconstruction

    Neural implicit representations have recently demonstrated compelling results on dense Simultaneous Localization And Mapping (SLAM) but suffer from the accumulation of errors in camera tracking and distortion in the reconstruction. To address this, we present GO-SLAM, a deep-learning-based dense visual SLAM framework globally optimizing poses and 3D reconstruction in real-time. Robust pose estimation is at its core, supported by efficient loop closing and online full bundle adjustment, which optimize per frame by utilizing the learned global geometry of the complete history of input frames. Simultaneously, we update the implicit and continuous surface representation on-the-fly to ensure global consistency of 3D reconstruction. Results on various synthetic and real-world datasets demonstrate that GO-SLAM outperforms state-of-the-art approaches at tracking robustness and reconstruction accuracy. Furthermore, GO-SLAM is versatile and can run with monocular, stereo, and RGB-D input. Comment: ICCV 2023. Code: https://github.com/youmi-zym/GO-SLAM - Project Page: https://youmi-zym.github.io/projects/GO-SLAM